Depth Creates No Bad Local Minima

Authors

  • Haihao Lu
  • Kenji Kawaguchi
Abstract

In deep learning, depth and nonlinearity each create non-convex loss surfaces. Does depth alone, then, create bad local minima? In this paper, we prove that without nonlinearity, depth alone does not create bad local minima, although it does induce a non-convex loss surface. Using this insight, we greatly simplify a recently proposed proof to show that all local minima of feedforward deep linear neural networks are global minima. Our theoretical result generalizes previous results while requiring fewer assumptions.
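The claim can be illustrated on the smallest possible case. The sketch below is not taken from the paper: it assumes a hypothetical two-layer scalar linear network with weights w1 and w2, a single training pair (x = 1, y = 1), and squared loss. All function names and the starting point are chosen here for illustration. It checks that the Hessian at the origin is indefinite, so the loss is non-convex and the origin is a saddle, and that plain gradient descent from a generic starting point still reaches zero loss.

```python
import numpy as np

def loss(w1, w2):
    # Squared error of the scalar linear network x -> w2 * w1 * x on (x=1, y=1).
    return (w1 * w2 - 1.0) ** 2

def grad(w1, w2):
    # Gradient of the loss with respect to (w1, w2).
    r = w1 * w2 - 1.0
    return np.array([2.0 * r * w2, 2.0 * r * w1])

def hessian(w1, w2):
    # Analytic Hessian of the loss.
    off_diag = 4.0 * w1 * w2 - 2.0
    return np.array([[2.0 * w2 ** 2, off_diag],
                     [off_diag, 2.0 * w1 ** 2]])

# Non-convexity: the Hessian at the origin has one negative and one
# positive eigenvalue, so the origin is a saddle point, not a minimum.
eigs = np.linalg.eigvalsh(hessian(0.0, 0.0))

# Gradient descent from an arbitrary (assumed) starting point.
w = np.array([0.5, -0.3])
step_size = 0.1
for _ in range(2000):
    w = w - step_size * grad(w[0], w[1])

w_final = w
final_loss = loss(w_final[0], w_final[1])

print("Hessian eigenvalues at origin:", eigs)
print("final weights:", w_final)
print("final loss:", final_loss)
```

In this toy case every local minimum lies on the curve w1 * w2 = 1, where the loss equals its global minimum of zero. The paper's result concerns general feedforward deep linear networks, which this scalar example does not prove.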

Similar resources

The loss surface and expressivity of deep convolutional neural networks

We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights. We show that such CNNs produce linearly independent features (and thus linearly separable) at every “wide” layer which has more neurons than the number of training samples. This condition holds e.g. for the VGG network. Furthermore, we provide for such wide CNNs necessary a...

Local minima in training of deep networks

There has been a lot of recent interest in trying to characterize the error surface of deep models. This stems from a long-standing question. Given that deep networks are highly nonlinear systems optimized by local gradient methods, why do they not seem to be affected by bad local minima? It is widely believed that training of deep models using gradient methods works so well because the error s...

Non-Contact Pulmonary Functional Testing Through an Improved Photometric Stereo Approach

A non-contact computer vision based system is developed for Pulmonary Functional Testing. The unique and novel features of the system are that it views the patients from both front and back and creates a 3D structure of the whole torso. By observing the 3D structure of the torso over time, the amount of air inhaled and exhaled is estimated. The Photometric Stereo method is used to recover local...

Towards Effective Low-bitwidth Convolutional Neural Networks

In this work, we aim to effectively train convolutional neural networks with both low-bitwidth weights and low-bitwidth activations. Optimization of a low-precision network is typically extremely unstable and is easily trapped in a bad local minimum, which results in noticeable accuracy loss. To mitigate this problem, we propose two novel approaches. On one hand, unlike previous methods that ...

No bad local minima: Data independent training error guarantees for multilayer neural networks

We use smoothed analysis techniques to provide guarantees on the training loss of Multilayer Neural Networks (MNNs) at differentiable local minima. Specifically, we examine MNNs with piecewise linear activation functions, quadratic loss and a single output, under mild over-parametrization. We prove that for an MNN with one hidden layer, the training error is zero at every differentiable local mi...

Journal:
  • CoRR

Volume: abs/1702.08580

Publication date: 2017